Discretization of Continuous Attributes in Supervised Learning Algorithms
Abstract
We propose a new algorithm, called CILA, for the discretization of continuous attributes. CILA can be used with any class-labeled data. In our tests, it generates discretization schemes that almost always have the highest dependence between the class labels and the discrete intervals, and always have a significantly lower number of intervals, compared with other state-of-the-art discretization algorithms. Using CILA as a preprocessing step for a machine learning algorithm significantly improves accuracy, and the resulting accuracy is higher than that obtained with other discretization algorithms.
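To make the two quantities in the abstract concrete (the number of intervals and the dependence between class labels and intervals), the following is a minimal sketch, not the CILA algorithm itself, whose details are not given here. It applies a fixed set of cut points to a continuous attribute, counts class labels per interval, and scores the scheme with the CAIM criterion as it is commonly stated in the CAIM literature (the mean over intervals of max_r^2 / M_r). The function names, the toy data, and the choice of CAIM as the dependence measure are all assumptions made for illustration.

```python
from bisect import bisect_right
from collections import Counter


def discretize(values, cut_points):
    """Map each continuous value to the index of its interval.

    cut_points are the interior boundaries, sorted ascending. A value equal
    to a boundary is assigned to the upper interval (an arbitrary choice).
    """
    return [bisect_right(cut_points, v) for v in values]


def class_counts(intervals, labels, n_intervals):
    """Count how often each class label falls into each interval."""
    counts = [Counter() for _ in range(n_intervals)]
    for idx, label in zip(intervals, labels):
        counts[idx][label] += 1
    return counts


def caim_value(counts):
    """CAIM criterion: average of max_r^2 / M_r over all intervals.

    max_r is the largest class count in interval r and M_r is the total
    count in interval r. Empty intervals contribute 0 to the sum.
    """
    total = 0.0
    for c in counts:
        m = sum(c.values())
        if m:
            total += max(c.values()) ** 2 / m
    return total / len(counts)


# Hypothetical toy data: one continuous attribute with two classes.
values = [1.0, 1.5, 2.0, 6.0, 6.5, 7.0]
labels = ["a", "a", "a", "b", "b", "b"]

good_cuts = [4.0]   # separates the two classes cleanly
poor_cuts = [1.75]  # mixes both classes in the upper interval

for cuts in (good_cuts, poor_cuts):
    intervals = discretize(values, cuts)
    counts = class_counts(intervals, labels, len(cuts) + 1)
    print(cuts, len(cuts) + 1, caim_value(counts))
```

With the same number of intervals, the scheme whose intervals are purer with respect to the class labels receives the higher score, which is the kind of comparison the abstract refers to.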
Similar resources
Discretization Algorithm that Uses Class-Attribute Interdependence Maximization
Most of the existing machine learning algorithms are able to extract knowledge from databases that store discrete attributes (features). If the attributes are continuous, the algorithms can be integrated with a discretization algorithm that transforms them into discrete attributes. The paper describes an algorithm, called CAIM (class-attribute interdependence maximization), for discretization o...
SPID4.5: A Selective Pseudo Iterative Deletion Discretization Algorithm for Machine Learning, Uncertain Reasoning & Pattern Recognition
Many machine learning algorithms developed for classification, prediction and uncertain reasoning cannot handle continuous features. To use them on real-world data sets, continuous attributes must be discretized into a small number of distinct ranges. Discretization also provides insight into critical values of continuous attributes. In this work (SPID4.5), an improvement of our previously pub...
Discretization oriented to Decision Rules Generation
Many supervised learning algorithms work only with spaces of discrete attributes. Some of the methods proposed in the literature focus on discretization aimed at the generation of decision rules. This work provides a new discretization algorithm called USD (Unparametrized Supervised Discretization), which transforms the infinite space of the values of the continuous attributes in a ...
Fast Class-Attribute Interdependence Maximization (CAIM) Discretization Algorithm
Discretization is the process of converting a continuous attribute into an attribute that contains a small number of distinct values. One of the major reasons for discretizing an attribute is that some machine learning algorithms perform poorly with continuous attributes and thus require front-end discretization of the input data. The paper describes a Fast Class-Attribute Interdependence Max...
An Evolution Strategies Approach to the Simultaneous Discretization of Numeric Attributes
Many data mining and machine learning algorithms require databases in which objects are described by discrete attributes. However, it is very common that the attributes are in the ratio or interval scales. In order to apply these algorithms, the original attributes must be transformed into the nominal or ordinal scale via discretization. An appropriate transformation is crucial because of the l...